LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation
arxiv.org·12h
🧠 LLM Inference
How Can AI Companies Protect On-Device AI Models and Deliver Updates Efficiently?
reddit.com·16h·Discuss: r/LocalLLaMA
📊 Model Serving Economics
AI hardware reimagined for lower energy use
techxplore.com·3h
⚡ Hardware Acceleration
Language Models Pack Billions of Concepts into 12,000 Dimensions
nickyoder.com·13h
🎯 Vector Quantization
How To Build Agentic AI That Ships
thenewstack.io·2h
📊 Model Serving Economics
Down and out with Cerebras Code
infoworld.com·7h
💾 Prompt Caching
Analog IMC Attention Mechanism For Fast And Energy-Efficient LLMs (FZJ, RWTH Aachen)
semiengineering.com·44m
🧠 LLM Inference
How next-gen laptops use NPUs for massive power savings
nordot.app·6h
🖥️ Hardware Architecture
Supercharge Your Docker Compose Applications with AI Models
ajeetraina.com·10h
🆕 New AI
Conquering the LLM Memory Wall: How to Run 2–4x Longer Contexts with a Single Line of Code
reddit.com·5h·Discuss: r/LocalLLaMA
🧠 LLM Inference
AI search optimization tools: Geoptie’s all-in-one GEO dashboard is live by Tor.app
searchengineland.com·5h
📊 Feed Optimization
Securing and Scaling AI-Powered APIs
capestart.com·3h·Discuss: Hacker News
🧠 Inference Serving
Productive AI Programming Using Forced Context Distillation
jx0.ca·4h·Discuss: Hacker News
🪄 Prompt Engineering
[CS 2881r AI Safety] [Week 1] Introduction
lesswrong.com·20h
🛡️ AI Safety
LLM Rerankers for RAG: A Practical Guide
fin.ai·19h
🏆 Ranking
AI Agent Development Trends 2025: Insights from 542 Projects
greenice.net·1h·Discuss: Hacker News
🆕 New AI
Demis Hassabis on the Release Notes podcast
domingogallardo.bearblog.dev·22h
🆕 New AI
Cloud CISO Perspectives: APAC security leaders speak out on AI and key topics
cloud.google.com·58m
🌐 Distributed systems
An AI-Powered Development Workflow for Solo Builders
spin.atomicobject.com·4h
🪄 Prompt Engineering
Google Pixel 10 Pro review: one of the very best smaller phones
theguardian.com·10h
📱 New tech trends